What Makes a Code Review Useful to OpenDev Developers? An Empirical Investigation
- URL: http://arxiv.org/abs/2302.11686v2
- Date: Mon, 19 Jun 2023 19:53:15 GMT
- Title: What Makes a Code Review Useful to OpenDev Developers? An Empirical Investigation
- Authors: Asif Kamal Turzo and Amiangshu Bosu
- Abstract summary: Even a minor improvement in the effectiveness of Code Reviews can yield significant savings for a software development organization.
This study aims to develop a finer-grained understanding of what makes a code review comment useful to OSS developers.
- Score: 4.061135251278187
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Context: Because Code Reviews (CR) require significant effort, even a
minor improvement in their effectiveness can yield significant savings for a
software development organization. Aim: This study aims to develop a
finer-grained understanding of what makes a code review comment useful to OSS
developers, to what extent a code review comment is considered useful to them,
and how various contextual and participant-related factors influence its
usefulness level. Method: To this end, we conducted a three-stage mixed-method
study. We randomly selected 2,500 CR comments from the OpenDev Nova project and
manually categorized the comments. We designed a survey of OpenDev developers
to better understand their perspectives on useful CRs. Combining our
survey-obtained scores with our manually labeled dataset, we trained two
regression models - one to identify factors that influence the usefulness of CR
comments and the other to identify factors that improve the odds of
'Functional' defect identification over the others. Key findings: The results
of our study suggest that a CR comment's usefulness is dictated not only by its
technical contributions, such as defect findings or quality improvement tips,
but also by its linguistic characteristics, such as comprehensibility and
politeness. While a reviewer's coding experience is positively associated with
CR usefulness, the number of mutual reviews, the comment volume in a file, the
total number of lines added/modified, and the CR interval have the opposite
association. While authorship and reviewership experience with the files under
review have been the most popular attributes for reviewer recommendation
systems, we do not find any significant association of those attributes with CR
usefulness.
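
The abstract describes the two regression analyses only at a high level. Below is a minimal, illustrative sketch of how such models could be fit in Python with pandas and statsmodels; the file name cr_comments.csv and the column names (usefulness_score, reviewer_coding_experience, mutual_reviews, comments_in_file, lines_changed, cr_interval, comprehensibility, politeness, is_functional_defect) are hypothetical stand-ins for the factors named in the abstract, not the authors' actual dataset or code.

```python
# Illustrative sketch only (not the paper's code): fits (1) a linear model for a
# CR comment's usefulness score and (2) a logistic model for the odds that a
# comment identifies a 'Functional' defect. File and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical dataset: one row per manually labeled CR comment, joined with
# survey-derived usefulness scores and contextual/participant factors.
df = pd.read_csv("cr_comments.csv")

# Model 1: which factors are associated with the usefulness score?
usefulness_model = smf.ols(
    "usefulness_score ~ reviewer_coding_experience + mutual_reviews"
    " + comments_in_file + lines_changed + cr_interval"
    " + comprehensibility + politeness",
    data=df,
).fit()
print(usefulness_model.summary())

# Model 2: which factors change the odds that a comment flags a
# 'Functional' defect rather than another category (binary outcome)?
defect_model = smf.logit(
    "is_functional_defect ~ reviewer_coding_experience + mutual_reviews"
    " + comments_in_file + lines_changed + cr_interval",
    data=df,
).fit()
print(defect_model.summary())
```

In a setup like this, the sign of each coefficient indicates the direction of association, which is how findings such as reviewer coding experience relating positively to usefulness, while lines added/modified and CR interval show the opposite association, would be read off the fitted models.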
 
      
Related papers
- Code Review as Decision-Making -- Building a Cognitive Model from the Questions Asked During Code Review [2.8299846354183953]
 We build a cognitive model of code review bottom up through thematic, statistical, temporal, and sequential analysis of the transcribed material.
The model shows how developers move through two phases during the code review: first an orientation phase to establish context and rationale, then an analytical phase to understand, assess, and plan the rest of the review.
 arXiv  Detail & Related papers  (2025-07-13T14:04:16Z)
- Training Language Model to Critique for Better Refinement [58.73039433159486]
 We introduce Refinement-oriented Critique Optimization (RCO), a novel framework designed to train critic models using refinement signals.
RCO uses a feedback loop where critiques, generated by the critic model, guide the actor model in refining its responses.
By focusing on critiques that lead to better refinements, RCO eliminates the need for direct critique preference assessment.
 arXiv  Detail & Related papers  (2025-06-27T12:10:57Z)
- Leveraging Reward Models for Guiding Code Review Comment Generation [13.306560805316103]
 Code review is a crucial component of modern software development, involving the evaluation of code quality, providing feedback on potential issues, and refining the code to address identified problems.
Deep learning techniques are able to tackle the generative aspect of code review by commenting on a given piece of code as a human reviewer would do.
In this paper, we introduce CoRAL, a deep learning framework automating review comment generation by exploiting reinforcement learning with a reward mechanism.
 arXiv  Detail & Related papers  (2025-06-04T21:31:38Z)
- LazyReview: A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews [74.87393214734114]
 This work introduces LazyReview, a dataset of peer-review sentences annotated with fine-grained lazy thinking categories.
Large Language Models (LLMs) struggle to detect these instances in a zero-shot setting.
However, instruction-based fine-tuning on our dataset significantly boosts performance by 10-20 points.
 arXiv  Detail & Related papers  (2025-04-15T10:07:33Z)
- Identifying Aspects in Peer Reviews [61.374437855024844]
 We develop a data-driven schema for deriving fine-grained aspects from a corpus of peer reviews.
We introduce a dataset of peer reviews augmented with aspects and show how it can be used for community-level review analysis.
 arXiv  Detail & Related papers  (2025-04-09T14:14:42Z)
- CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models [97.18215355266143]
 We introduce a holistic code critique benchmark for Large Language Models (LLMs) called CodeCriticBench.
Specifically, our CodeCriticBench includes two mainstream code tasks (i.e., code generation and code QA) with different difficulties.
In addition, the evaluation protocols include both basic and advanced critique evaluation for different characteristics.
 arXiv  Detail & Related papers  (2025-02-23T15:36:43Z)
- Harnessing Large Language Models for Curated Code Reviews [2.5944208050492183]
 In code review, generating structured and relevant comments is crucial for identifying code issues and facilitating accurate code changes.
Existing code review datasets are often noisy and unrefined, posing limitations to the learning potential of AI models.
We propose a curation pipeline designed to enhance the quality of the largest publicly available code review dataset.
 arXiv  Detail & Related papers  (2025-02-05T18:15:09Z)
- Hold On! Is My Feedback Useful? Evaluating the Usefulness of Code Review Comments [0.0]
 This paper investigates the usefulness of Code Review Comments (CR comments) through textual feature-based and featureless approaches.
Our models outperform the baseline by achieving state-of-the-art performance.
Our analyses portray the similarities and differences of domains, projects, datasets, models, and features for predicting the usefulness of CR comments.
 arXiv  Detail & Related papers  (2025-01-12T07:22:13Z)
- Can Large Language Models Serve as Evaluators for Code Summarization? [47.21347974031545]
 Large Language Models (LLMs) serve as effective evaluators for code summarization methods.
The proposed CODERPE method prompts LLMs to play diverse roles, such as code reviewer, code author, code editor, and system analyst.
 CODERPE achieves an 81.59% Spearman correlation with human evaluations, outperforming the existing BERTScore metric by 17.27%. (A generic sketch of computing such a rank correlation appears after this list.)
 arXiv  Detail & Related papers  (2024-12-02T09:56:18Z)
- Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
 We analyzed 2,401 code review comments from Java open-source projects on GitHub.
Of the suggestions for improvement, 83.9% were accepted and integrated, with fewer than 1% later reverted.
 arXiv  Detail & Related papers  (2024-10-29T12:21:23Z)
- Knowledge-Guided Prompt Learning for Request Quality Assurance in Public   Code Review [15.019556560416403]
 Public Code Review (PCR) serves as a supplement to a development team's internal code review.
We propose a Knowledge-guided Prompt learning approach for Public Code Review to achieve developer-based quality assurance of code review requests.
 arXiv  Detail & Related papers  (2024-10-29T02:48:41Z)
- CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution [74.41064280094064]
 CompassJudger-1 is the first open-source all-in-one judge LLM.
 CompassJudger-1 is a general-purpose LLM that demonstrates remarkable versatility.
JudgerBench is a new benchmark that encompasses various subjective evaluation tasks.
 arXiv  Detail & Related papers  (2024-10-21T17:56:51Z)
- Leveraging Reviewer Experience in Code Review Comment Generation [11.224317228559038]
 We train deep learning models to imitate human reviewers in providing natural language code reviews.
The quality of the model-generated reviews remains sub-optimal due to the quality of the open-source code review data used in model training.
We propose a suite of experience-aware training methods that utilise the reviewers' past authoring and reviewing experiences as signals for review quality.
 arXiv  Detail & Related papers  (2024-09-17T07:52:50Z)
- Team-related Features in Code Review Prediction Models [10.576931077314887]
 We evaluate the prediction power of features related to code ownership, workload, and team relationship.
Our results show that, individually, features related to code ownership have the best prediction power.
We conclude that all proposed features together with lines of code can make the best predictions for both reviewer participation and amount of feedback.
 arXiv  Detail & Related papers  (2023-12-11T09:30:09Z)
- Towards Automated Classification of Code Review Feedback to Support Analytics [4.423428708304586]
 This study aims to develop an automated code review comment classification system.
We trained and evaluated supervised learning-based DNN models leveraging code context, comment text, and a set of code metrics.
Our approach outperforms Fregnan et al.'s approach by achieving 18.7% higher accuracy.
 arXiv  Detail & Related papers  (2023-07-07T21:53:20Z)
- Exploring the Advances in Identifying Useful Code Review Comments [0.0]
 This paper reflects the evolution of research on the usefulness of code review comments.
It examines papers that define the usefulness of code review comments, mine and annotate datasets, study developers' perceptions, analyze factors from different aspects, and use machine learning classifiers to automatically predict the usefulness of code review comments.
 arXiv  Detail & Related papers  (2023-07-03T00:41:20Z)
- SIFN: A Sentiment-aware Interactive Fusion Network for Review-based Item Recommendation [48.1799451277808]
 We propose a Sentiment-aware Interactive Fusion Network (SIFN) for review-based item recommendation.
We first encode user/item reviews via BERT and propose a lightweight sentiment learner to extract semantic features of each review.
Then, we propose a sentiment prediction task that guides the sentiment learner to extract sentiment-aware features via explicit sentiment labels.
 arXiv  Detail & Related papers  (2021-08-18T08:04:38Z)
- Deep Just-In-Time Inconsistency Detection Between Comments and Source Code [51.00904399653609]
 In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code.
We develop a deep-learning approach that learns to correlate a comment with code changes.
We show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system.
 arXiv  Detail & Related papers  (2020-10-04T16:49:28Z)
- A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss [51.448615489097236]
 Acquiring accurate summarization and sentiment from user reviews is an essential component of modern e-commerce platforms.
We propose a novel dual-view model that jointly improves the performance of these two tasks.
Experiment results on four real-world datasets from different domains demonstrate the effectiveness of our model.
 arXiv  Detail & Related papers  (2020-06-02T13:34:11Z)
- Code Review in the Classroom [57.300604527924015]
 Young developers in a classroom setting provide a clear picture of the potential favourable and problematic areas of the code review process.
Their feedback indicates that the process was well received, along with some points for improving it.
This paper can be used as guidelines to perform code reviews in the classroom.
 arXiv  Detail & Related papers  (2020-04-19T06:07:45Z)