What Makes a Code Review Useful to OpenDev Developers? An Empirical Investigation
- URL: http://arxiv.org/abs/2302.11686v2
- Date: Mon, 19 Jun 2023 19:53:15 GMT
- Title: What Makes a Code Review Useful to OpenDev Developers? An Empirical Investigation
- Authors: Asif Kamal Turzo and Amiangshu Bosu
- Abstract summary: Even a minor improvement in the effectiveness of Code Reviews can yield significant savings for a software development organization.
This study aims to develop a finer-grained understanding of what makes a code review comment useful to OSS developers.
- Score: 4.061135251278187
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Context: Because Code Reviews (CR) require significant effort, even a
minor improvement in their effectiveness can yield significant savings for a
software development organization. Aim: This study aims to develop a
finer-grained understanding of what makes a code review comment useful to OSS
developers, to what extent a code review comment is considered useful to them,
and how various contextual and participant-related factors influence its
usefulness level. Method: To this end, we conducted a three-stage mixed-method
study. We randomly selected 2,500 CR comments from the OpenDev Nova project and
manually categorized the comments. We designed a survey of OpenDev developers
to better understand their perspectives on useful CRs. Combining our
survey-obtained scores with our manually labeled dataset, we trained two
regression models - one to identify factors that influence the usefulness of CR
comments and the other to identify factors that improve the odds of
'Functional' defect identification over the others. Key findings: The results
of our study suggest that a CR comment's usefulness is dictated not only by its
technical contributions, such as defect findings or quality improvement tips,
but also by its linguistic characteristics, such as comprehensibility and
politeness. While a reviewer's coding experience is positively associated with
CR usefulness, the number of mutual reviews, the comment volume in a file, the
total number of lines added/modified, and the CR interval have the opposite
association. While authorship and reviewership experience with the files under
review have been the most popular attributes for reviewer recommendation
systems, we do not find any significant association of those attributes with CR
usefulness.
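
The abstract describes the two regression analyses only at a high level. Below is a minimal, illustrative sketch of how such models could be fit in Python with pandas and statsmodels; the file name cr_comments.csv and the column names (usefulness_score, reviewer_coding_experience, mutual_reviews, comments_in_file, lines_changed, cr_interval, comprehensibility, politeness, is_functional_defect) are hypothetical stand-ins for the factors named in the abstract, not the authors' actual dataset or code.

```python
# Illustrative sketch only (not the paper's code): fits (1) a linear model for a
# CR comment's usefulness score and (2) a logistic model for the odds that a
# comment identifies a 'Functional' defect. File and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical dataset: one row per manually labeled CR comment, joined with
# survey-derived usefulness scores and contextual/participant factors.
df = pd.read_csv("cr_comments.csv")

# Model 1: which factors are associated with the usefulness score?
usefulness_model = smf.ols(
    "usefulness_score ~ reviewer_coding_experience + mutual_reviews"
    " + comments_in_file + lines_changed + cr_interval"
    " + comprehensibility + politeness",
    data=df,
).fit()
print(usefulness_model.summary())

# Model 2: which factors change the odds that a comment flags a
# 'Functional' defect rather than another category (binary outcome)?
defect_model = smf.logit(
    "is_functional_defect ~ reviewer_coding_experience + mutual_reviews"
    " + comments_in_file + lines_changed + cr_interval",
    data=df,
).fit()
print(defect_model.summary())
```

In a setup like this, the sign of each coefficient indicates the direction of association, which is how findings such as reviewer coding experience relating positively to usefulness, while lines added/modified and CR interval show the opposite association, would be read off the fitted models.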
 
      
Related papers
- Code Review as Decision-Making -- Building a Cognitive Model from the Questions Asked During Code Review [2.8299846354183953]
 We build a cognitive model of code review bottom up through thematic, statistical, temporal, and sequential analysis of the transcribed material.
The model shows how developers move through two phases during the code review: first an orientation phase to establish context and rationale, then an analytical phase to understand, assess, and plan the rest of the review.
 arXiv  Detail & Related papers  (2025-07-13T14:04:16Z)
- Training Language Model to Critique for Better Refinement [58.73039433159486]
 We introduce Refinement-oriented Critique Optimization (RCO), a novel framework designed to train critic models using refinement signals.
RCO uses a feedback loop where critiques, generated by the critic model, guide the actor model in refining its responses.
By focusing on critiques that lead to better refinements, RCO eliminates the need for direct critique preference assessment.
 arXiv  Detail & Related papers  (2025-06-27T12:10:57Z)
- Leveraging Reward Models for Guiding Code Review Comment Generation [13.306560805316103]
 Code review is a crucial component of modern software development, involving the evaluation of code quality, providing feedback on potential issues, and refining the code to address identified problems.
Deep learning techniques are able to tackle the generative aspect of code review by commenting on a given piece of code as a human reviewer would do.
In this paper, we introduce CoRAL, a deep learning framework automating review comment generation by exploiting reinforcement learning with a reward mechanism.
 arXiv  Detail & Related papers  (2025-06-04T21:31:38Z)
- LazyReview: A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews [74.87393214734114]
 This work introduces LazyReview, a dataset of peer-review sentences annotated with fine-grained lazy thinking categories.
Large Language Models (LLMs) struggle to detect these instances in a zero-shot setting.
However, instruction-based fine-tuning on our dataset significantly boosts performance by 10-20 points.
 arXiv  Detail & Related papers  (2025-04-15T10:07:33Z)
- Identifying Aspects in Peer Reviews [61.374437855024844]
 We develop a data-driven schema for deriving fine-grained aspects from a corpus of peer reviews.
We introduce a dataset of peer reviews augmented with aspects and show how it can be used for community-level review analysis.
 arXiv  Detail & Related papers  (2025-04-09T14:14:42Z)
- CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models [97.18215355266143]
 We introduce a holistic code critique benchmark for Large Language Models (LLMs) called CodeCriticBench.
Specifically, our CodeCriticBench includes two mainstream code tasks (i.e., code generation and code QA) with different difficulties.
In addition, the evaluation protocols include both basic and advanced critique evaluation for different characteristics.
 arXiv  Detail & Related papers  (2025-02-23T15:36:43Z)
- Harnessing Large Language Models for Curated Code Reviews [2.5944208050492183]
 In code review, generating structured and relevant comments is crucial for identifying code issues and facilitating accurate code changes.
Existing code review datasets are often noisy and unrefined, posing limitations to the learning potential of AI models.
We propose a curation pipeline designed to enhance the quality of the largest publicly available code review dataset.
 arXiv  Detail & Related papers  (2025-02-05T18:15:09Z)
- Hold On! Is My Feedback Useful? Evaluating the Usefulness of Code Review Comments [0.0]
 This paper investigates the usefulness of Code Review Comments (CR comments) through textual feature-based and featureless approaches.
Our models outperform the baseline by achieving state-of-the-art performance.
Our analyses portray the similarities and differences of domains, projects, datasets, models, and features for predicting the usefulness of CR comments.
 arXiv  Detail & Related papers  (2025-01-12T07:22:13Z)
- Can Large Language Models Serve as Evaluators for Code Summarization? [47.21347974031545]
 Large Language Models (LLMs) serve as effective evaluators for code summarization methods.
The proposed CODERPE method prompts LLMs to play diverse roles, such as code reviewer, code author, code editor, and system analyst.
 CODERPE achieves an 81.59% Spearman correlation with human evaluations, outperforming the existing BERTScore metric by 17.27%. (A generic sketch of computing such a rank correlation appears after this list.)
 arXiv  Detail & Related papers  (2024-12-02T09:56:18Z)
- Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
 We analyzed 2,401 code review comments from Java open-source projects on GitHub.
Of the suggestions for improvement, 83.9% were accepted and integrated, with fewer than 1% later reverted.
 arXiv  Detail & Related papers  (2024-10-29T12:21:23Z)
- Knowledge-Guided Prompt Learning for Request Quality Assurance in Public   Code Review [15.019556560416403]
 Public Code Review (PCR) serves as a supplement to a development team's internal code review.
We propose a Knowledge-guided Prompt learning approach for Public Code Review to achieve developer-based quality assurance of code review requests.
 arXiv  Detail & Related papers  (2024-10-29T02:48:41Z)
- CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution [74.41064280094064]
 CompassJudger-1 is the first open-source all-in-one judge LLM.
 CompassJudger-1 is a general-purpose LLM that demonstrates remarkable versatility.
JudgerBench is a new benchmark that encompasses various subjective evaluation tasks.
 arXiv  Detail & Related papers  (2024-10-21T17:56:51Z)
- Leveraging Reviewer Experience in Code Review Comment Generation [11.224317228559038]
 We train deep learning models to imitate human reviewers in providing natural language code reviews.
The quality of the model-generated reviews remains sub-optimal due to the quality of the open-source code review data used in model training.
We propose a suite of experience-aware training methods that utilise the reviewers' past authoring and reviewing experiences as signals for review quality.
 arXiv  Detail & Related papers  (2024-09-17T07:52:50Z)
- Team-related Features in Code Review Prediction Models [10.576931077314887]
 We evaluate the prediction power of features related to code ownership, workload, and team relationship.
Our results show that, individually, features related to code ownership have the best prediction power.
We conclude that all proposed features together with lines of code can make the best predictions for both reviewer participation and amount of feedback.
 arXiv  Detail & Related papers  (2023-12-11T09:30:09Z)
- Towards Automated Classification of Code Review Feedback to Support Analytics [4.423428708304586]
 This study aims to develop an automated code review comment classification system.
We trained and evaluated supervised learning-based DNN models leveraging code context, comment text, and a set of code metrics.
Our approach outperforms Fregnan et al.'s approach by achieving 18.7% higher accuracy.
 arXiv  Detail & Related papers  (2023-07-07T21:53:20Z)
- Exploring the Advances in Identifying Useful Code Review Comments [0.0]
 This paper reflects the evolution of research on the usefulness of code review comments.
It examines papers that define the usefulness of code review comments, mine and annotate datasets, study developers' perceptions, analyze factors from different aspects, and use machine learning classifiers to automatically predict the usefulness of code review comments.
 arXiv  Detail & Related papers  (2023-07-03T00:41:20Z)
- SIFN: A Sentiment-aware Interactive Fusion Network for Review-based Item Recommendation [48.1799451277808]
 We propose a Sentiment-aware Interactive Fusion Network (SIFN) for review-based item recommendation.
We first encode user/item reviews via BERT and propose a lightweight sentiment learner to extract semantic features of each review.
Then, we propose a sentiment prediction task that guides the sentiment learner to extract sentiment-aware features via explicit sentiment labels.
 arXiv  Detail & Related papers  (2021-08-18T08:04:38Z)
- Deep Just-In-Time Inconsistency Detection Between Comments and Source Code [51.00904399653609]
 In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code.
We develop a deep-learning approach that learns to correlate a comment with code changes.
We show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system.
 arXiv  Detail & Related papers  (2020-10-04T16:49:28Z)
- A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss [51.448615489097236]
 Acquiring accurate summarization and sentiment from user reviews is an essential component of modern e-commerce platforms.
We propose a novel dual-view model that jointly improves the performance of these two tasks.
Experiment results on four real-world datasets from different domains demonstrate the effectiveness of our model.
 arXiv  Detail & Related papers  (2020-06-02T13:34:11Z)
- Code Review in the Classroom [57.300604527924015]
 Young developers in a classroom setting provide a clear picture of the potential favourable and problematic areas of the code review process.
Their feedback indicates that the process was well received, along with some points for improving it.
This paper can be used as guidelines to perform code reviews in the classroom.
 arXiv  Detail & Related papers  (2020-04-19T06:07:45Z)