Using Large-scale Heterogeneous Graph Representation Learning for Code
Review Recommendations
- URL: http://arxiv.org/abs/2202.02385v1
- Date: Fri, 4 Feb 2022 20:58:54 GMT
- Title: Using Large-scale Heterogeneous Graph Representation Learning for Code
Review Recommendations
- Authors: Jiyang Zhang, Chandra Maddila, Ram Bairi, Christian Bird, Ujjwal
Raizada, Apoorva Agrawal, Yamini Jhawar, Kim Herzig, Arie van Deursen
- Abstract summary: We present CORAL, a novel approach to reviewer recommendation.
We use a socio-technical graph built from the rich set of entities.
We show that CORAL is able to model the manual history of reviewer selection remarkably well.
- Score: 7.260832843615661
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Code review is an integral part of any mature software development process,
and identifying the best reviewer for a code change is a well-accepted problem
within the software engineering community. Selecting a reviewer who lacks
expertise and understanding can slow development or result in more defects. To
date, most reviewer recommendation systems rely primarily on historical file
change and review information; those who changed or reviewed a file in the past
are the best positioned to review in the future. We posit that while these
approaches are able to identify and suggest qualified reviewers, they may be
blind to reviewers who have the needed expertise and have simply never
interacted with the changed files before. To address this, we present CORAL, a
novel approach to reviewer recommendation that leverages a socio-technical
graph built from the rich set of entities (developers, repositories, files,
pull requests, work-items, etc.) and their relationships in modern source code
management systems. We employ a graph convolutional neural network on this
graph and train it on two and a half years of history on 332 repositories. We
show that CORAL is able to model the manual history of reviewer selection
remarkably well. Further, based on an extensive user study, we demonstrate that
this approach identifies relevant and qualified reviewers who traditional
reviewer recommenders miss, and that these developers desire to be included in
the review process. Finally, we find that "classical" reviewer recommendation
systems perform better on smaller (in terms of developers) software projects
while CORAL excels on larger projects, suggesting that there is "no one model
to rule them all."
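The abstract describes training a graph convolutional network on a socio-technical graph of developers, files, and pull requests. A minimal sketch of the core GCN propagation step on a toy graph is below; the node names, feature sizes, random weights, and dot-product scoring are illustrative assumptions, not CORAL's actual data or architecture:

```python
import numpy as np

# Toy socio-technical graph: two developers, one file, one pull request.
nodes = ["dev_a", "dev_b", "file_x", "pr_1"]
edges = [(0, 2), (1, 2), (0, 3), (2, 3)]  # e.g. "changed" / "reviewed" links

n = len(nodes)
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0  # undirected adjacency

# Standard GCN normalization: A_hat = D^{-1/2} (A + I) D^{-1/2}
A_tilde = A + np.eye(n)
d = A_tilde.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt

H = np.random.default_rng(0).normal(size=(n, 8))  # initial node features
W = np.random.default_rng(1).normal(size=(8, 4))  # weights (random, untrained)
H_next = np.maximum(A_hat @ H @ W, 0.0)           # one layer: ReLU(A_hat H W)

# A reviewer recommender could then score developers against a pull request
# by comparing their learned embeddings (purely illustrative here):
score = H_next[0] @ H_next[3]  # dev_a vs. pr_1
print(H_next.shape)
```

In a trained model, `W` would be learned from the review history, and candidate reviewers would be ranked by their embedding similarity to the pull request under review.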
Related papers
- Improving Automated Code Reviews: Learning from Experience [12.573740138977065]
This study investigates whether higher-quality reviews can be generated from automated code review models.
We find that experience-aware oversampling can increase the correctness, level of information, and meaningfulness of reviews.
arXiv Detail & Related papers (2024-02-06T07:48:22Z)
- Code Reviewer Recommendation Based on a Hypergraph with Multiplex Relationships [30.74556500021384]
We present MIRRec, a novel code reviewer recommendation method that leverages a hypergraph with multiplex relationships.
MIRRec encodes high-order correlations that go beyond traditional pairwise connections using degree-free hyperedges among pull requests and developers.
To validate the effectiveness of MIRRec, we conducted experiments using a dataset comprising 48,374 pull requests from ten popular open-source software projects hosted on GitHub.
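The summary above notes that hyperedges can encode high-order correlations beyond pairwise connections. A minimal illustration of that idea using an incidence matrix is sketched below; the entities and hyperedges are made up for illustration, and this is not MIRRec's actual model:

```python
import numpy as np

# Entities in a toy review graph.
entities = ["dev_a", "dev_b", "dev_c", "pr_1", "pr_2"]

# Unlike an edge, a hyperedge may connect any number of nodes at once,
# e.g. all developers and pull requests involved in one interaction.
hyperedges = [
    {0, 1, 3},     # dev_a and dev_b both participated in pr_1
    {1, 2, 3, 4},  # dev_b and dev_c linked across pr_1 and pr_2
]

n, m = len(entities), len(hyperedges)
H = np.zeros((n, m))  # incidence matrix: H[v, e] = 1 if node v is in hyperedge e
for e, members in enumerate(hyperedges):
    for v in members:
        H[v, e] = 1.0

# H @ H.T counts shared hyperedges between node pairs, capturing
# co-occurrence that a plain pairwise adjacency matrix would flatten away.
co = H @ H.T
print(co[1, 3])  # dev_b and pr_1 co-occur in both hyperedges
```

Hypergraph neural networks build on exactly this incidence structure, propagating features from nodes to hyperedges and back rather than along single edges.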
arXiv Detail & Related papers (2024-01-19T15:25:14Z)
- Factoring Expertise, Workload, and Turnover into Code Review Recommendation [4.492444446637857]
We show that code review naturally spreads knowledge, thereby reducing the files at risk due to turnover.
We develop novel recommenders to understand their impact on the level of expertise during review.
We are able to globally increase expertise during reviews (+3%), reduce workload concentration (-12%), and reduce the files at risk (-28%).
arXiv Detail & Related papers (2023-12-28T18:58:06Z)
- Improving Code Reviewer Recommendation: Accuracy, Latency, Workload, and Bystanders [6.538051328482194]
We build upon RevRecV1, the recommender that has been in production since 2018.
We find that reviewers were being assigned based on prior authorship of files.
Having an individual who is responsible for the review reduces the time taken for reviews by 11%.
arXiv Detail & Related papers (2023-12-28T17:55:13Z)
- Team-related Features in Code Review Prediction Models [10.576931077314887]
We evaluate the prediction power of features related to code ownership, workload, and team relationship.
Our results show that, individually, features related to code ownership have the best prediction power.
We conclude that all proposed features together with lines of code can make the best predictions for both reviewer participation and amount of feedback.
arXiv Detail & Related papers (2023-12-11T09:30:09Z)
- Towards Personalized Review Summarization by Modeling Historical Reviews from Customer and Product Separately [59.61932899841944]
Review summarization is a non-trivial task that aims to summarize the main idea of a product review on e-commerce websites.
We propose the Heterogeneous Historical Review aware Review Summarization Model (HHRRS)
We employ a multi-task framework that conducts the review sentiment classification and summarization jointly.
arXiv Detail & Related papers (2023-01-27T12:32:55Z)
- CodeExp: Explanatory Code Document Generation [94.43677536210465]
Existing code-to-text generation models produce only high-level summaries of code.
We conduct a human study to identify the criteria for high-quality explanatory docstrings for code.
We present a multi-stage fine-tuning strategy and baseline models for the task.
arXiv Detail & Related papers (2022-11-25T18:05:44Z)
- ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback to student code on a new programming question from just a few examples by instructors.
Our approach was successfully deployed to deliver feedback to 16,000 student exam-solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z)
- A Survey on Neural Recommendation: From Collaborative Filtering to Content and Context Enriched Recommendation [70.69134448863483]
Research in recommendation has shifted to inventing new recommender models based on neural networks.
In recent years, we have witnessed significant progress in developing neural recommender models.
arXiv Detail & Related papers (2021-04-27T08:03:52Z)
- Can We Automate Scientific Reviewing? [89.50052670307434]
We discuss the possibility of using state-of-the-art natural language processing (NLP) models to generate first-pass peer reviews for scientific papers.
We collect a dataset of papers in the machine learning domain, annotate them with different aspects of content covered in each review, and train targeted summarization models that take in papers to generate reviews.
Comprehensive experimental results show that system-generated reviews tend to touch upon more aspects of the paper than human-written reviews.
arXiv Detail & Related papers (2021-01-30T07:16:53Z)
- A Survey on Knowledge Graph-Based Recommender Systems [65.50486149662564]
We conduct a systematic survey of knowledge graph-based recommender systems.
We focus on how the papers utilize the knowledge graph for accurate and explainable recommendation.
We introduce datasets used in these works.
arXiv Detail & Related papers (2020-02-28T02:26:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.