The Value of Effective Pull Request Description
- URL: http://arxiv.org/abs/2602.14611v1
- Date: Mon, 16 Feb 2026 10:15:21 GMT
- Title: The Value of Effective Pull Request Description
- Authors: Shirin Pirouzkhah, Pavlína Wurzel Gonçalves, Alberto Bacchelli
- Abstract summary: In pull-based development, code contributions are submitted as pull requests (PRs) to undergo reviews and approval by other developers. We conducted a grey literature review of guidelines on writing PR descriptions and derived a taxonomy of eight recommended elements. We analyzed 80K GitHub PRs across 156 projects and five programming languages to assess associations between these elements and code review outcomes.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the pull-based development model, code contributions are submitted as pull requests (PRs) to undergo reviews and approval by other developers with the goal of being merged into the code base. A PR can be supported by a description, whose role has not yet been systematically investigated. To fill in this gap, we conducted a mixed-methods empirical study of PR descriptions. We conducted a grey literature review of guidelines on writing PR descriptions and derived a taxonomy of eight recommended elements. Using this taxonomy, we analyzed 80K GitHub PRs across 156 projects and five programming languages to assess associations between these elements and code review outcomes (e.g., merge decision, latency, first response time, review comments, and review iteration cycles). To complement these results, we surveyed 64 developers about the perceived importance of each element. Finally, we analyzed which submission-time factors predict whether PRs include a description and which elements they contain. We found that developers view PR descriptions as important, but their elements matter differently: purpose and code explanations are valued by developers for preserving the rationale and history of changes, while stating the desired feedback type best predicts change acceptance and reviewer engagement. PR descriptions are also more common in mature projects and complex changes, suggesting they are written when most useful rather than as a formality.
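The element-tagging analysis the abstract describes can be illustrated with a toy sketch: a heuristic tagger that checks whether a PR description contains three of the elements the abstract names (purpose, code explanation, and desired feedback type). The keyword patterns, element names, and function below are illustrative assumptions for a minimal sketch, not the paper's actual coding scheme or element taxonomy.

```python
import re

# Illustrative keyword heuristics for three of the description elements
# named in the abstract; the paper's real eight-element taxonomy and
# coding procedure are not reproduced here.
ELEMENT_PATTERNS = {
    "purpose": re.compile(r"\b(fixes|closes|resolves|this pr|goal|purpose)\b", re.I),
    "code_explanation": re.compile(r"\b(refactor\w*|renamed|moved|implementation|changed)\b", re.I),
    "feedback_type": re.compile(r"\b(please review|feedback|wdyt|looking for|thoughts)\b", re.I),
}

def tag_elements(description: str) -> set[str]:
    """Return the set of recommended elements a PR description appears to contain."""
    return {name for name, pattern in ELEMENT_PATTERNS.items()
            if pattern.search(description)}

example = "This PR fixes #42 by refactoring the cache layer. Looking for feedback on naming."
print(sorted(tag_elements(example)))
# → ['code_explanation', 'feedback_type', 'purpose']
```

A study like the one summarized above would then correlate the presence of each tagged element with review outcomes such as merge decision or first response time; real coding of 80K PRs would of course require a far more careful scheme than keyword matching.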
Related papers
- Understanding Dominant Themes in Reviewing Agentic AI-authored Code [6.183483850365225]
We analyze 19,450 inline review comments spanning 3,177 agent-authored PRs from real-world GitHub repositories. We find that while AI agents can accelerate code production, there remain gaps requiring targeted human review oversight.
arXiv Detail & Related papers (2026-01-27T07:21:09Z) - How AI Coding Agents Modify Code: A Large-Scale Study of GitHub Pull Requests [0.0]
We analyze 24,014 merged Agentic PRs (440,295 commits) and 5,081 merged Human PRs (23,242 commits). Agentic PRs differ substantially from Human PRs in commit count (Cliff's $\delta = 0.5429$) and show moderate differences in files touched and deleted lines. These findings provide a large-scale empirical characterization of how AI coding agents contribute to open source development.
arXiv Detail & Related papers (2026-01-24T20:27:04Z) - Identifying Aspects in Peer Reviews [59.02879434536289]
We develop a data-driven schema for deriving aspects from a corpus of peer reviews. We introduce a dataset of peer reviews augmented with aspects and show how it can be used for community-level review analysis.
arXiv Detail & Related papers (2025-04-09T14:14:42Z) - Hold On! Is My Feedback Useful? Evaluating the Usefulness of Code Review Comments [0.0]
This paper investigates the usefulness of Code Review Comments (CR comments) through textual feature-based and featureless approaches. Our models outperform the baseline by achieving state-of-the-art performance. Our analyses portray the similarities and differences of domains, projects, datasets, models, and features for predicting the usefulness of CR comments.
arXiv Detail & Related papers (2025-01-12T07:22:13Z) - Knowledge-Guided Prompt Learning for Request Quality Assurance in Public Code Review [10.11544476732565]
We propose Knowledge-guided Prompt learning for Public Code Review (KP-PCR) to achieve developer-based code review request quality assurance. Experimental results on the PCR dataset for the period 2011-2023 demonstrate that our KP-PCR outperforms baselines.
arXiv Detail & Related papers (2024-10-29T02:48:41Z) - Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective [85.48043537327258]
We propose MANGO (comMents As Natural loGic pivOts), including a comment contrastive training strategy and a corresponding logical comment decoding strategy.
Results indicate that MANGO significantly improves the code pass rate based on the strong baselines.
The robustness of the logical comment decoding strategy is notably higher than the Chain-of-thoughts prompting.
arXiv Detail & Related papers (2024-04-11T08:30:46Z) - Explaining Explanation: An Empirical Study on Explanation in Code Reviews [17.005837826213416]
We study the types of explanations used in code reviews and explore the potential of Large Language Models (LLMs)
We extracted 793 code review comments from Gerrit and manually labeled them based on whether they contained a suggestion, an explanation, or both.
Our analysis shows that 42% of comments only include suggestions without explanations.
arXiv Detail & Related papers (2023-11-15T15:08:38Z) - What Makes a Code Review Useful to OpenDev Developers? An Empirical Investigation [4.061135251278187]
Even a minor improvement in the effectiveness of Code Reviews can incur significant savings for a software development organization.
This study aims to develop a finer grain understanding of what makes a code review comment useful to OSS developers.
arXiv Detail & Related papers (2023-02-22T22:48:27Z) - SIFN: A Sentiment-aware Interactive Fusion Network for Review-based Item Recommendation [48.1799451277808]
We propose a Sentiment-aware Interactive Fusion Network (SIFN) for review-based item recommendation.
We first encode user/item reviews via BERT and propose a light-weighted sentiment learner to extract semantic features of each review.
Then, we propose a sentiment prediction task that guides the sentiment learner to extract sentiment-aware features via explicit sentiment labels.
arXiv Detail & Related papers (2021-08-18T08:04:38Z) - Deep Just-In-Time Inconsistency Detection Between Comments and Source Code [51.00904399653609]
In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code.
We develop a deep-learning approach that learns to correlate a comment with code changes.
We show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system.
arXiv Detail & Related papers (2020-10-04T16:49:28Z) - A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss [51.448615489097236]
Acquiring accurate summarization and sentiment from user reviews is an essential component of modern e-commerce platforms.
We propose a novel dual-view model that jointly improves the performance of these two tasks.
Experiment results on four real-world datasets from different domains demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2020-06-02T13:34:11Z) - Automating App Review Response Generation [67.58267006314415]
We propose a novel approach RRGen that automatically generates review responses by learning knowledge relations between reviews and their responses.
Experiments on 58 apps and 309,246 review-response pairs highlight that RRGen outperforms the baselines by at least 67.4% in terms of BLEU-4.
arXiv Detail & Related papers (2020-02-10T05:23:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.