Increasing, not Diminishing: Investigating the Returns of Highly
Maintainable Code
- URL: http://arxiv.org/abs/2401.13407v1
- Date: Wed, 24 Jan 2024 12:05:06 GMT
- Title: Increasing, not Diminishing: Investigating the Returns of Highly
Maintainable Code
- Authors: Markus Borg and Ilyana Pruvost and Enys Mones and Adam Tornhill
- Abstract summary: We study the association between code quality on the one hand, and defect count and implementation time on the other hand.
We introduce a value-creation model, derived from regression analyses, to explore relative changes from a baseline.
We discuss the findings within the context of the "broken windows" theory and recommend that organizations diligently prevent the introduction of code smells in files with high churn.
- Score: 6.031345629422313
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding and effectively managing Technical Debt (TD) remains a vital
challenge in software engineering. While many studies on code-level TD have
been published, few illustrate the business impact of low-quality source code.
In this study, we combine two publicly available datasets to study the
association between code quality on the one hand, and defect count and
implementation time on the other hand. We introduce a value-creation model,
derived from regression analyses, to explore relative changes from a baseline.
Our results show that the associations vary across different intervals of code
quality. Furthermore, the value model suggests strong non-linearities at the
extremes of the code quality spectrum. Most importantly, the model suggests
amplified returns on investment in the upper end. We discuss the findings
within the context of the "broken windows" theory and recommend that
organizations diligently prevent the introduction of code smells in files with high churn.
Finally, we argue that the value-creation model can be used to initiate
discussions regarding the return on investment in refactoring efforts.
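As a rough illustration of the value-creation idea in the abstract, below is a minimal sketch (not the authors' implementation): it regresses defect count and implementation time on a code-quality score and expresses predictions as relative change from a baseline quality level. The synthetic data, the 1-10 quality scale, and the log-linear functional form are illustrative assumptions only.

```python
# Toy value-creation model: regress two outcomes on a code-quality score and
# report each prediction relative to a baseline quality. All data are synthetic.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-file observations: a quality score in [1, 10] and
# defect counts / implementation times that worsen as quality drops.
quality = rng.uniform(1.0, 10.0, size=500)
defects = rng.poisson(np.exp(2.0 - 0.25 * quality))
impl_time = np.exp(3.0 - 0.15 * quality + rng.normal(0.0, 0.2, size=500))

# Simple log-linear regressions of each outcome on quality.
b_defects = np.polyfit(quality, np.log(defects + 1), deg=1)
b_time = np.polyfit(quality, np.log(impl_time), deg=1)

def relative_change(coeffs, q, q_baseline=5.0):
    """Predicted outcome at quality q, expressed relative to a baseline quality."""
    return np.exp(np.polyval(coeffs, q)) / np.exp(np.polyval(coeffs, q_baseline))

for q in (2.0, 5.0, 9.0):
    print(f"quality={q}: defects x{relative_change(b_defects, q):.2f}, "
          f"implementation time x{relative_change(b_time, q):.2f} vs. baseline")
```

In a setup like this, the relative-change curves across quality intervals are the kind of quantity the paper's value model explores, including the non-linearities it reports at the extremes of the quality spectrum.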
Related papers
- VerIF: Verification Engineering for Reinforcement Learning in Instruction Following [55.60192044049083]
Reinforcement learning with verifiable rewards (RLVR) has become a key technique for enhancing large language models (LLMs). We propose VerIF, a verification method that combines rule-based code verification with LLM-based verification from a large reasoning model. We apply RL training with VerIF to two models, achieving significant improvements across several representative instruction-following benchmarks.
arXiv Detail & Related papers (2025-06-11T17:10:36Z) - Towards A Generalist Code Embedding Model Based On Massive Data Synthesis [35.04242699869519]
We introduce CodeR (Code Retrieval), a state-of-the-art embedding model for general-purpose code retrieval. The superior performance of CodeR is built upon CodeR-Pile, a large-scale synthetic dataset constructed under the DRU principle.
arXiv Detail & Related papers (2025-05-19T04:37:53Z) - OpenCodeReasoning: Advancing Data Distillation for Competitive Coding [61.15402517835137]
We build a supervised fine-tuning (SFT) dataset to achieve state-of-the-art coding capability results in models of various sizes.
Our models use only SFT to achieve 61.8% on LiveCodeBench and 24.6% on CodeContests, surpassing alternatives trained with reinforcement learning.
arXiv Detail & Related papers (2025-04-02T17:50:31Z) - Code Summarization Beyond Function Level [0.213063058314067]
This study investigated the effectiveness of code summarization models beyond the function level.
The fine-tuned state-of-the-art CodeT5+ base model excelled in code summarization.
Repository-level summarization exhibited promising potential but requires significant computational resources.
arXiv Detail & Related papers (2025-02-23T20:31:21Z) - Learning to Solve and Verify: A Self-Play Framework for Code and Test Generation [69.62857948698436]
Recent advances in large language models (LLMs) have improved their performance on coding benchmarks.
However, improvement is plateauing due to the exhaustion of readily available high-quality data.
We propose Sol-Ver, a self-play solver-verifier framework that jointly improves a single model's code and test generation capacity.
arXiv Detail & Related papers (2025-02-20T18:32:19Z) - SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
arXiv Detail & Related papers (2024-08-28T06:33:03Z) - What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated than canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z) - Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach [66.51005288743153]
We investigate the legal and ethical issues of current neural code completion models.
We tailor a membership inference approach (termed CodeMI) that was originally crafted for classification tasks.
We evaluate the effectiveness of this adapted approach across a diverse array of neural code completion models.
arXiv Detail & Related papers (2024-04-22T15:54:53Z) - Quantifying Contamination in Evaluating Code Generation Capabilities of
Language Models [27.24738197172374]
Large language models have achieved remarkable performance on various code generation benchmarks.
There have been growing concerns regarding potential contamination of these benchmarks as they may be leaked into pretraining and finetuning data.
We show that there is substantial overlap between popular code generation benchmarks and open training corpora, and that models perform significantly better on the subset of the benchmarks where similar solutions are seen during training.
arXiv Detail & Related papers (2024-03-06T21:45:35Z) - Improving the Learning of Code Review Successive Tasks with Cross-Task
Knowledge Distillation [1.0878040851638]
We introduce a novel deep-learning architecture, named DISCOREV, which employs cross-task knowledge distillation to address these tasks simultaneously.
We show that our approach generates better review comments, as measured by the BLEU score, as well as more accurate code refinement according to the CodeBLEU score.
arXiv Detail & Related papers (2024-02-03T07:02:22Z) - Evaluation of large language models for assessing code maintainability [4.2909314120969855]
We investigate the association between the cross-entropy of code generated by ten different models and quality aspects.
Our results show that, controlling for the number of logical lines of code, cross-entropy computed by LLMs is indeed a predictor of maintainability at the class level.
While the complexity of the LLM affects the range of cross-entropy values, this plays a significant role in predicting maintainability aspects (a minimal cross-entropy sketch appears after this list).
arXiv Detail & Related papers (2024-01-23T12:29:42Z) - LLM-Assisted Code Cleaning For Training Accurate Code Generators [53.087019724256606]
We investigate data quality for code and find that making the code more structured and readable leads to improved code generation performance of the system.
We build a novel data-cleaning pipeline that uses these principles to transform existing programs.
We evaluate our approach on two challenging algorithmic code generation benchmarks and find that fine-tuning CodeLLaMa-7B improves the performance by up to 30% compared to fine-tuning on the original dataset.
arXiv Detail & Related papers (2023-11-25T02:45:50Z) - DA-VEGAN: Differentiably Augmenting VAE-GAN for microstructure
reconstruction from extremely small data sets [110.60233593474796]
DA-VEGAN is a model with two central innovations.
A $\beta$-variational autoencoder is incorporated into a hybrid GAN architecture.
A custom differentiable data augmentation scheme is developed specifically for this architecture.
arXiv Detail & Related papers (2023-02-17T08:49:09Z) - Multifidelity Reinforcement Learning with Control Variates [3.2895195535353317]
In many computational science and engineering applications, the output of a system of interest corresponding to a given input can be queried at different levels of fidelity with different costs.
We study the reinforcement learning problem in the presence of multiple environments with different levels of fidelity for a given control task.
A multifidelity estimator that exploits the cross-correlations between the low- and high-fidelity returns is proposed to reduce the variance in the estimation of the state-action value function.
arXiv Detail & Related papers (2022-06-10T15:01:37Z) - Detecting and adapting to crisis pattern with context based Deep
Reinforcement Learning [6.224519494738852]
We present an innovative DRL framework consisting of two sub-networks fed respectively with the past performances and standard deviations of portfolio strategies, as well as additional contextual features.
Results on the test set show that this approach substantially outperforms traditional portfolio optimization methods such as Markowitz and is able to detect and anticipate crises such as the Covid one.
arXiv Detail & Related papers (2020-09-07T12:11:08Z) - A Transformer-based Approach for Source Code Summarization [86.08359401867577]
We learn code representation for summarization by modeling the pairwise relationship between code tokens.
We show that, despite its simplicity, the approach outperforms state-of-the-art techniques by a significant margin.
arXiv Detail & Related papers (2020-05-01T23:29:36Z)
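As referenced in the code-maintainability entry above, here is a minimal sketch of computing the token-level cross-entropy of a code snippet under a causal language model, together with a crude logical-lines-of-code control. This is an illustrative assumption, not that paper's pipeline; the model name and code snippet are examples only.

```python
# Cross-entropy of a code snippet under a causal LM as a rough maintainability
# signal, with a simple logical-LOC control. Model and snippet are examples.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any causal LM; the paper evaluates ten different models
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

code = "public int add(int a, int b) {\n    return a + b;\n}\n"
inputs = tok(code, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs, labels=inputs["input_ids"])

cross_entropy = out.loss.item()  # mean per-token cross-entropy
logical_loc = sum(1 for line in code.splitlines() if line.strip())  # crude LLOC
print(f"cross-entropy={cross_entropy:.3f}, logical LOC={logical_loc}")
```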
This list is automatically generated from the titles and abstracts of the papers on this site.