MORCoRA: Multi-Objective Refactoring Recommendation Considering Review Availability
- URL: http://arxiv.org/abs/2408.06568v1
- Date: Tue, 13 Aug 2024 02:08:16 GMT
- Title: MORCoRA: Multi-Objective Refactoring Recommendation Considering Review Availability
- Authors: Lei Chen, Shinpei Hayashi
- Abstract summary: It is essential to ensure that the searched sequence of refactorings can be reviewed promptly.
We propose MORCoRA, a multi-objective search-based technique that searches for refactoring sequences that improve code quality, preserve semantics, and possess high review availability.
- Score: 6.439206681270567
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Background: Search-based refactoring involves searching for a sequence of refactorings to achieve specific objectives. Although a typical objective is improving code quality, a different perspective is also required; the searched sequence must undergo review before being applied and may not be applied if the review fails or is postponed due to no proper reviewers. Aim: Therefore, it is essential to ensure that the searched sequence of refactorings can be reviewed promptly by reviewers who meet two criteria: 1) having enough expertise and 2) being free of heavy workload. The two criteria are regarded as the review availability of the refactoring sequence. Method: We propose MORCoRA, a multi-objective search-based technique that can search for code quality improvable, semantic preserved, and high review availability possessed refactoring sequences and corresponding proper reviewers. Results: We evaluate MORCoRA on six open-source repositories. The quantitative analysis reveals that MORCoRA can effectively recommend refactoring sequences that fit the requirements. The qualitative analysis demonstrates that the refactorings recommended by MORCoRA can enhance code quality and effectively address code smells. Furthermore, the recommended reviewers for those refactorings possess high expertise and are available to review. Conclusions: We recommend that refactoring recommenders consider both the impact on quality improvement and the developer resources required for review when recommending refactorings.
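The abstract describes searching for refactoring sequences under multiple objectives (quality improvement, semantic preservation, review availability). The core of any such multi-objective search is Pareto-optimal selection: keeping only candidates not dominated on every objective. A minimal sketch of that selection step, with hypothetical objective values (not the paper's actual fitness functions or data):

```python
# Illustrative sketch, not the authors' implementation: selecting the
# Pareto-optimal refactoring sequences under two hypothetical objectives,
# quality gain and review availability (both maximized).

def dominates(a, b):
    """True if candidate a is at least as good as b on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates):
    """Return the non-dominated subset of objective tuples."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o != c)]

# Hypothetical scored sequences: (quality gain, review availability)
seqs = [(0.8, 0.2), (0.5, 0.9), (0.3, 0.3), (0.8, 0.9)]
print(pareto_front(seqs))  # -> [(0.8, 0.9)]
```

In a full multi-objective algorithm this dominance check would drive the selection step of each search generation; here it only illustrates the trade-off structure the abstract refers to.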
Related papers
- Automated Refactoring of Non-Idiomatic Python Code: A Differentiated Replication with LLMs [54.309127753635366]
We present the results of a replication study in which we investigate GPT-4's effectiveness in recommending and suggesting idiomatic refactoring actions.
Our findings underscore the potential of LLMs to achieve tasks where, in the past, implementing recommenders based on complex code analyses was required.
arXiv Detail & Related papers (2025-01-28T15:41:54Z) - RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques [59.861013614500024]
We introduce a new benchmark designed to assess the critique capabilities of Large Language Models (LLMs)
Unlike existing benchmarks, which typically function in an open-loop fashion, our approach employs a closed-loop methodology that evaluates the quality of corrections generated from critiques.
arXiv Detail & Related papers (2025-01-24T13:48:10Z) - Unanswerability Evaluation for Retrieval Augmented Generation [74.3022365715597]
UAEval4RAG is a framework designed to evaluate whether RAG systems can handle unanswerable queries effectively.
We define a taxonomy with six unanswerable categories, and UAEval4RAG automatically synthesizes diverse and challenging queries.
arXiv Detail & Related papers (2024-12-16T19:11:55Z) - JudgeRank: Leveraging Large Language Models for Reasoning-Intensive Reranking [81.88787401178378]
We introduce JudgeRank, a novel agentic reranker that emulates human cognitive processes when assessing document relevance.
We evaluate JudgeRank on the reasoning-intensive BRIGHT benchmark, demonstrating substantial performance improvements over first-stage retrieval methods.
In addition, JudgeRank performs on par with fine-tuned state-of-the-art rerankers on the popular BEIR benchmark, validating its zero-shot generalization capability.
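The JudgeRank entry describes an LLM-based judge re-ordering first-stage retrieval results by assessed relevance. A schematic sketch of that two-stage pattern, where `judge_relevance` is a hypothetical stand-in (simple token overlap) for the actual LLM judgment described in the paper:

```python
# Schematic sketch of judge-based reranking in the spirit of JudgeRank.
# judge_relevance is a hypothetical stand-in for an LLM relevance judgment;
# the real system prompts a language model instead.

def judge_relevance(query, doc):
    """Stand-in relevance score: fraction of query tokens found in the doc."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def rerank(query, candidates):
    """Re-order first-stage retrieval candidates by the judge's score."""
    return sorted(candidates, key=lambda d: judge_relevance(query, d), reverse=True)

docs = ["retrieval augmented generation survey",
        "reasoning intensive reranking with language models",
        "cooking recipes for beginners"]
print(rerank("reranking with language models", docs)[0])
# -> "reasoning intensive reranking with language models"
```

The design point the entry highlights is that the scoring function is zero-shot: swapping the stand-in for an LLM call requires no task-specific fine-tuning of the reranker.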
arXiv Detail & Related papers (2024-10-31T18:43:12Z) - Deciphering Refactoring Branch Dynamics in Modern Code Review: An Empirical Study on Qt [5.516979718589074]
This study aims to understand the review process for changes in the Refactor branch and to identify what developers care about when reviewing code in this branch.
We find that reviews from the Refactor branch take significantly less time to resolve.
Additionally, documentation of developer intent is notably sparse within the Refactor branch compared to other branches.
arXiv Detail & Related papers (2024-10-07T01:18:56Z) - Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks [68.49251303172674]
State-of-the-art large language models (LLMs) exhibit impressive problem-solving capabilities but may struggle with complex reasoning and factual correctness.
Existing methods harness the strengths of chain-of-thought and retrieval-augmented generation (RAG) to decompose a complex problem into simpler steps and apply retrieval to improve factual correctness.
We introduce Critic-guided planning with Retrieval-augmentation, CR-Planner, a novel framework that leverages fine-tuned critic models to guide both reasoning and retrieval processes through planning.
arXiv Detail & Related papers (2024-10-02T11:26:02Z) - RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation [65.5353313491402]
We introduce RethinkMCTS, which employs the Monte Carlo Tree Search (MCTS) algorithm to conduct thought-level searches before generating code.
We construct verbal feedback from fine-grained code execution feedback to refine erroneous thoughts during the search.
We demonstrate that RethinkMCTS outperforms previous search-based and feedback-based code generation baselines.
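RethinkMCTS builds on Monte Carlo Tree Search over "thoughts". The selection step of MCTS is typically the UCT rule, which balances exploiting high-value nodes against exploring rarely-visited ones. A minimal sketch of that rule; the node statistics and exploration constant are illustrative, not taken from the paper:

```python
# Minimal sketch of the UCT selection rule underlying MCTS-based approaches
# such as RethinkMCTS. Node values, visit counts, and c are illustrative.
import math

def uct_score(child_value, child_visits, parent_visits, c=1.4):
    """Upper Confidence bound for Trees: average value plus an exploration bonus."""
    if child_visits == 0:
        return float("inf")  # unvisited children are always tried first
    return child_value / child_visits + c * math.sqrt(math.log(parent_visits) / child_visits)

def select_child(children, parent_visits):
    """Pick the child 'thought' node with the highest UCT score."""
    return max(children, key=lambda ch: uct_score(ch["value"], ch["visits"], parent_visits))

# Hypothetical thought nodes with accumulated value and visit counts
children = [{"id": "t1", "value": 3.0, "visits": 5},
            {"id": "t2", "value": 1.0, "visits": 1},
            {"id": "t3", "value": 0.0, "visits": 0}]
print(select_child(children, parent_visits=6)["id"])  # -> "t3" (unvisited first)
```

In the full algorithm this selection is followed by expansion, simulation (here, code generation and execution), and backpropagation of the execution feedback up the tree.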
arXiv Detail & Related papers (2024-09-15T02:07:28Z) - In Search of Metrics to Guide Developer-Based Refactoring Recommendations [13.063733696956678]
Refactoring is a well-established approach to improving source code quality without compromising its external behavior.
We propose an empirical study into the metrics that capture developers' willingness to apply refactoring operations.
We will quantify the value of product and process metrics in grasping developers' motivations to perform refactoring.
arXiv Detail & Related papers (2024-07-25T16:32:35Z) - On Unified Prompt Tuning for Request Quality Assurance in Public Code Review [19.427661961488404]
We propose a unified framework called UniPCR to complete developer-based request quality assurance (i.e., the subtasks of predicting request necessity and recommending tags) under a Masked Language Model (MLM).
Experimental results on the Public Code Review dataset for the time span 2011-2022 demonstrate that our UniPCR framework adapts to the two subtasks and outperforms comparable accuracy-based results with state-of-the-art methods for request quality assurance.
arXiv Detail & Related papers (2024-04-11T17:41:28Z) - State of Refactoring Adoption: Better Understanding Developer Perception of Refactoring [5.516979718589074]
We aim to explore how developers document their refactoring activities during the software life cycle.
We call such activity Self-Affirmed Refactoring (SAR), which indicates developers' documentation of their refactoring activities.
We propose an approach to identify whether a commit describes developer-related refactoring events, to classify them according to the common quality improvement categories.
arXiv Detail & Related papers (2023-06-09T16:38:20Z) - How We Refactor and How We Document it? On the Use of Supervised Machine Learning Algorithms to Classify Refactoring Documentation [25.626914797750487]
Refactoring is the art of improving the design of a system without altering its external behavior.
This study categorizes commits into 3 categories, namely, Internal QA, External QA, and Code Smell Resolution, along with the traditional BugFix and Functional categories.
To better understand our classification results, we analyzed commit messages to extract patterns that developers regularly use to describe their refactoring activities.
arXiv Detail & Related papers (2020-10-26T20:33:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.