Are Machine Programming Systems using Right Source-Code Measures to
Select Code Repositories?
- URL: http://arxiv.org/abs/2209.11946v1
- Date: Sat, 24 Sep 2022 07:34:18 GMT
- Title: Are Machine Programming Systems using Right Source-Code Measures to
Select Code Repositories?
- Authors: Niranjan Hasabnis
- Abstract summary: Machine programming (MP) is an emerging field at the intersection of deterministic and probabilistic computing.
MP systems often rely on vast amounts of open-source code to learn interesting properties about code and programming.
MP systems either do not consider the quality of code repositories or use atypical quality measures.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine programming (MP) is an emerging field at the intersection of
deterministic and probabilistic computing, and it aims to assist software and
hardware engineers, among other applications. Along with powerful compute
resources, MP systems often rely on vast amounts of open-source code to learn
interesting properties about code and programming and to solve problems in the
areas of debugging, code recommendation, auto-completion, etc. Unfortunately,
several existing MP systems either do not consider the quality of code
repositories or use quality measures that differ from those typically used in
the software engineering community to select them. As such, the impact of code
repository quality on the performance of these systems needs to be studied.
In this preliminary paper, we evaluate the impact of repositories of different
quality on the performance of a candidate MP system. Towards that objective, we
develop a framework, named GitRank, to rank open-source repositories on
quality, maintainability, and popularity by leveraging existing research on
this topic. We then apply GitRank to evaluate the correlation between the
quality measures used by the candidate MP system and the quality measures used
by our framework. Our preliminary results reveal some correlation between the
quality measures used in GitRank and the performance of ControlFlag, our
candidate MP system, suggesting that some of the measures used in GitRank are
applicable to ControlFlag. However, the results also raise questions about the
right quality measures for code repositories used in MP systems. We believe
that our findings also generate interesting insights into the code quality
measures that affect the performance of MP systems.
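
As a rough illustration of the kind of correlation analysis described in the abstract (a minimal sketch, not the paper's actual pipeline), the snippet below aggregates hypothetical per-repository quality, maintainability, and popularity scores into a GitRank-style composite score and measures its Spearman rank correlation with a hypothetical MP-system performance value, such as ControlFlag's accuracy on each repository. The repository names, scores, and equal weights are all assumptions made for illustration.

```python
# Illustrative sketch only (not the paper's implementation): combine
# hypothetical GitRank-style scores for a few repositories and check how
# well they correlate with a hypothetical MP-system performance measure.
from scipy.stats import spearmanr

# Hypothetical per-repository scores, each normalized to [0, 1]:
# (quality, maintainability, popularity)
repo_scores = {
    "repo-a": (0.91, 0.85, 0.40),
    "repo-b": (0.62, 0.70, 0.95),
    "repo-c": (0.45, 0.50, 0.30),
    "repo-d": (0.80, 0.60, 0.75),
}

# Hypothetical MP-system performance (e.g., ControlFlag accuracy) per repository.
mp_performance = {"repo-a": 0.78, "repo-b": 0.55, "repo-c": 0.41, "repo-d": 0.70}

# Equal weights are an assumption; GitRank's real aggregation may differ.
WEIGHTS = (1.0 / 3, 1.0 / 3, 1.0 / 3)


def composite_score(scores, weights=WEIGHTS):
    """Weighted sum of the three criteria into one composite score."""
    return sum(w * s for w, s in zip(weights, scores))


repos = sorted(repo_scores)
gitrank_scores = [composite_score(repo_scores[name]) for name in repos]
performance = [mp_performance[name] for name in repos]

# Spearman rank correlation between the quality-based ranking and performance.
rho, p_value = spearmanr(gitrank_scores, performance)
print(f"Spearman rho = {rho:.2f} (p-value = {p_value:.2f})")
```

In this setting, a strong positive correlation would suggest that the software-engineering quality measures aggregated by a GitRank-style ranking are also predictive of the MP system's performance, which is the question the paper investigates.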
Related papers
- On Iterative Evaluation and Enhancement of Code Quality Using GPT-4o [1.5960340244043023]
This paper introduces CodeQUEST, a novel framework leveraging Large Language Models (LLMs) to iteratively evaluate and enhance code quality.
The framework has two main components; its Evaluator assesses code quality across ten dimensions, providing both quantitative scores and qualitative summaries.
Our study demonstrates that CodeQUEST can effectively and robustly evaluate code quality, with its assessments aligning with established code quality metrics.
arXiv Detail & Related papers (2025-02-11T09:27:00Z) - CoReQA: Uncovering Potentials of Language Models in Code Repository Question Answering [12.431784613373523]
We introduce CoReQA, a benchmark for Code Repository-level question answering.
CoReQA was constructed from GitHub issues and comments from 176 popular repositories across four programming languages.
We show that state-of-the-art proprietary and long-context models struggle to address repository-level questions effectively.
arXiv Detail & Related papers (2025-01-07T00:24:07Z) - DOCE: Finding the Sweet Spot for Execution-Based Code Generation [69.5305729627198]
We propose a comprehensive framework that includes candidate generation, $n$-best reranking, minimum Bayes risk (MBR) decoding, and self-debugging as the core components.
Our findings highlight the importance of execution-based methods and the gap between execution-based and execution-free methods.
arXiv Detail & Related papers (2024-08-25T07:10:36Z) - CodeRAG-Bench: Can Retrieval Augment Code Generation? [78.37076502395699]
We conduct a systematic, large-scale analysis of code generation using retrieval-augmented generation.
We first curate a comprehensive evaluation benchmark, CodeRAG-Bench, encompassing three categories of code generation tasks.
We examine top-performing models on CodeRAG-Bench by providing contexts retrieved from one or multiple sources.
arXiv Detail & Related papers (2024-06-20T16:59:52Z) - InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models [56.723509505549536]
To our knowledge, InfiBench is the first large-scale free-form question-answering (QA) benchmark for code.
It comprises 234 carefully selected, high-quality Stack Overflow questions spanning 15 programming languages.
We conduct a systematic evaluation of over 100 recent code LLMs on InfiBench, leading to a series of novel and insightful findings.
arXiv Detail & Related papers (2024-03-11T02:06:30Z) - RepoAgent: An LLM-Powered Open-Source Framework for Repository-level
Code Documentation Generation [79.83270415843857]
We introduce RepoAgent, a large language model powered open-source framework aimed at proactively generating, maintaining, and updating code documentation.
We have validated the effectiveness of our approach, showing that RepoAgent excels in generating high-quality repository-level documentation.
arXiv Detail & Related papers (2024-02-26T15:39:52Z) - CodePori: Large-Scale System for Autonomous Software Development Using Multi-Agent Technology [4.2990995991059275]
Large Language Models (LLMs) and Generative Pre-trained Transformers (GPTs) have transformed the field of Software Engineering.
We introduce CodePori, a novel system designed to automate code generation for large and complex software projects.
Results: CodePori is able to generate running code for large-scale projects, aligned with the typical software development process.
arXiv Detail & Related papers (2024-02-02T13:42:50Z) - Finding Software Vulnerabilities in Open-Source C Projects via Bounded
Model Checking [2.9129603096077332]
We advocate that bounded model-checking techniques can efficiently detect vulnerabilities in general software systems.
We have developed and evaluated a methodology to verify large software systems using a state-of-the-art bounded model checker.
arXiv Detail & Related papers (2023-11-09T11:25:24Z) - Lessons from Formally Verified Deployed Software Systems (Extended version) [65.69802414600832]
This article examines a range of projects, in various application areas, that have produced formally verified systems and deployed them for actual use.
It considers the technologies used, the form of verification applied, the results obtained, and the lessons that the software industry should draw regarding its ability to benefit from formal verification techniques and tools.
arXiv Detail & Related papers (2023-01-05T18:18:46Z) - GitRank: A Framework to Rank GitHub Repositories [0.0]
Open-source repositories provide a wealth of information and are increasingly being used to build artificial intelligence (AI) based systems.
In this hackathon, we utilize known code quality measures and the GrimoireLab toolkit to implement a framework, named GitRank, to rank open-source repositories on three different criteria.
arXiv Detail & Related papers (2022-05-04T23:42:30Z) - Measuring Coding Challenge Competence With APPS [54.22600767666257]
We introduce APPS, a benchmark for code generation.
Our benchmark includes 10,000 problems, which range from having simple one-line solutions to being substantial algorithmic challenges.
Recent models such as GPT-Neo can pass approximately 15% of the test cases of introductory problems.
arXiv Detail & Related papers (2021-05-20T17:58:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.