Are Machine Programming Systems using Right Source-Code Measures to
Select Code Repositories?
- URL: http://arxiv.org/abs/2209.11946v1
- Date: Sat, 24 Sep 2022 07:34:18 GMT
- Title: Are Machine Programming Systems using Right Source-Code Measures to
Select Code Repositories?
- Authors: Niranjan Hasabnis
- Abstract summary: Machine programming (MP) is an emerging field at the intersection of deterministic and probabilistic computing.
MP systems often rely on vast amounts of open-source code to learn interesting properties about code and programming.
MP systems either do not consider the quality of code repositories or use atypical quality measures to select them.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine programming (MP) is an emerging field at the intersection of
deterministic and probabilistic computing, and it aims to assist software and
hardware engineers, among other applications. Along with powerful compute
resources, MP systems often rely on vast amounts of open-source code to learn
interesting properties about code and programming and to solve problems in areas
such as debugging, code recommendation, and auto-completion. Unfortunately,
several of the existing MP systems either do not consider the quality of code
repositories or select them using quality measures that are atypical of those
commonly used in the software engineering community. As such, the impact of
code repository quality on the performance of these systems needs to be studied.
In this preliminary paper, we evaluate the impact of repositories of different
quality on the performance of a candidate MP system, ControlFlag. Towards that
objective, we develop a framework, named GitRank, to rank open-source
repositories on quality, maintainability, and popularity by leveraging existing
research on this topic. We then apply GitRank to evaluate the correlation
between the quality measures used by the candidate MP system and the quality
measures used by our framework. Our preliminary results reveal some correlation
between the quality measures used in GitRank and ControlFlag's performance,
suggesting that some of the measures used in GitRank are applicable to
ControlFlag. However, the results also raise questions about the right quality
measures for code repositories used in MP systems. We believe that our findings
generate interesting insights into the code quality measures that affect the
performance of MP systems.
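To make the correlation analysis described above concrete, the following is a minimal sketch of how one might correlate per-repository quality scores (of the kind a GitRank-style framework produces) with an MP system's per-repository performance, using Spearman's rank correlation. The repository names, the numbers, and the spearman_rank_correlation helper are hypothetical illustrations, not artifacts of the paper.

```python
# Hypothetical sketch: correlating repository quality scores with an MP
# system's per-repository performance using Spearman rank correlation.
# All repository names and numbers below are made up for illustration.

def ranks(values):
    """Assign average 1-based ranks to values, handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    rank = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # group tied values and give them the average of their ranks
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            rank[order[k]] = avg
        i = j + 1
    return rank

def spearman_rank_correlation(xs, ys):
    """Pearson correlation of the rank vectors, i.e. Spearman's rho."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-repository data: a quality score (as a GitRank-style
# ranking might assign) and an MP-system performance metric (e.g.,
# anomaly-detection precision measured on that repository).
quality_score  = {"repo_a": 0.91, "repo_b": 0.62, "repo_c": 0.75, "repo_d": 0.40}
mp_performance = {"repo_a": 0.88, "repo_b": 0.73, "repo_c": 0.70, "repo_d": 0.51}

repos = sorted(quality_score)
rho = spearman_rank_correlation(
    [quality_score[r] for r in repos],
    [mp_performance[r] for r in repos],
)
print(f"Spearman correlation between quality and performance: {rho:.2f}")
```

A coefficient near 1 would suggest that repositories ranked higher on quality also tend to yield better MP-system performance, which is the kind of relationship the paper probes.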
Related papers
- DOCE: Finding the Sweet Spot for Execution-Based Code Generation [69.5305729627198]
We propose a comprehensive framework that includes candidate generation, $n$-best reranking, minimum Bayes risk (MBR) decoding, and self-debugging as the core components.
Our findings highlight the importance of execution-based methods and the gap between execution-based and execution-free methods.
arXiv Detail & Related papers (2024-08-25T07:10:36Z) - CodeRAG-Bench: Can Retrieval Augment Code Generation? [78.37076502395699]
We conduct a systematic, large-scale analysis of code generation using retrieval-augmented generation.
We first curate a comprehensive evaluation benchmark, CodeRAG-Bench, encompassing three categories of code generation tasks.
We examine top-performing models on CodeRAG-Bench by providing contexts retrieved from one or multiple sources.
arXiv Detail & Related papers (2024-06-20T16:59:52Z) - Code Agents are State of the Art Software Testers [10.730852617039451]
We investigate the capability of LLM-based Code Agents for formalizing user issues into test cases.
We propose a novel benchmark based on popular GitHub repositories, containing real-world issues, ground-truth patches, and golden tests.
We find that LLMs generally perform surprisingly well at generating relevant test cases with Code Agents designed for code repair.
arXiv Detail & Related papers (2024-06-18T14:54:37Z) - InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models [56.723509505549536]
InfiBench is the first large-scale freeform question-answering (QA) benchmark for code to our knowledge.
It comprises 234 carefully selected high-quality Stack Overflow questions that span 15 programming languages.
We conduct a systematic evaluation for over 100 latest code LLMs on InfiBench, leading to a series of novel and insightful findings.
arXiv Detail & Related papers (2024-03-11T02:06:30Z) - RepoAgent: An LLM-Powered Open-Source Framework for Repository-level
Code Documentation Generation [79.83270415843857]
We introduce RepoAgent, a large language model powered open-source framework aimed at proactively generating, maintaining, and updating code documentation.
We have validated the effectiveness of our approach, showing that RepoAgent excels in generating high-quality repository-level documentation.
arXiv Detail & Related papers (2024-02-26T15:39:52Z) - CodePori: Large-Scale System for Autonomous Software Development Using Multi-Agent Technology [4.2990995991059275]
Large Language Models (LLMs) and Generative Pre-trained Transformers (GPTs) have transformed the field of Software Engineering.
We introduce CodePori, a novel system designed to automate code generation for large and complex software projects.
Results: CodePori is able to generate running code for large-scale projects, aligned with the typical software development process.
arXiv Detail & Related papers (2024-02-02T13:42:50Z) - MAgIC: Investigation of Large Language Model Powered Multi-Agent in
Cognition, Adaptability, Rationality and Collaboration [102.41118020705876]
Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing.
As their applications extend into multi-agent environments, a need has arisen for a comprehensive evaluation framework.
This work introduces a novel benchmarking framework specifically tailored to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z) - Finding Software Vulnerabilities in Open-Source C Projects via Bounded
Model Checking [2.9129603096077332]
We advocate that bounded model-checking techniques can efficiently detect vulnerabilities in general software systems.
We have developed and evaluated a methodology to verify large software systems using a state-of-the-art bounded model checker.
arXiv Detail & Related papers (2023-11-09T11:25:24Z) - State-Of-The-Practice in Quality Assurance in Java-Based Open Source
Software Development [3.4800665691198565]
We investigate whether and how quality assurance approaches are used in conjunction with one another in the development of 1,454 popular open source software projects on GitHub.
Our study indicates that projects typically do not follow all quality assurance practices together with high intensity.
In general, our study provides a deeper understanding of how existing quality assurance approaches are currently being used in Java-based open source software development.
arXiv Detail & Related papers (2023-06-16T07:43:11Z) - Lessons from Formally Verified Deployed Software Systems (Extended version) [65.69802414600832]
This article examines a range of projects, in various application areas, that have produced formally verified systems and deployed them for actual use.
It considers the technologies used, the form of verification applied, the results obtained, and the lessons that the software industry should draw regarding its ability to benefit from formal verification techniques and tools.
arXiv Detail & Related papers (2023-01-05T18:18:46Z) - GitRank: A Framework to Rank GitHub Repositories [0.0]
Open-source repositories provide a wealth of information and are increasingly being used to build artificial intelligence (AI) based systems.
In this hackathon, we utilize known code quality measures and the GrimoireLab toolkit to implement a framework, named GitRank, to rank open-source repositories on three different criteria (a minimal illustrative sketch of such a multi-criteria ranking follows this entry).
arXiv Detail & Related papers (2022-05-04T23:42:30Z)
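As a rough illustration of the multi-criteria ranking that GitRank performs, the sketch below aggregates three per-repository scores (quality, maintainability, popularity) into a single weighted score and ranks repositories by it. The metric names, weights, and repository data are assumptions made for illustration; they are not GitRank's actual measures or formula.

```python
# Hypothetical sketch of ranking repositories on three criteria
# (quality, maintainability, popularity) via a weighted aggregate score.
# The metrics, weights, and repository data are illustrative only.

from dataclasses import dataclass

@dataclass
class RepoMetrics:
    name: str
    quality: float          # e.g., static-analysis score scaled to [0, 1]
    maintainability: float  # e.g., maintainability index scaled to [0, 1]
    popularity: float       # e.g., normalized star count in [0, 1]

def aggregate_score(m: RepoMetrics, weights=(0.4, 0.4, 0.2)) -> float:
    """Weighted sum of the three per-criterion scores."""
    wq, wm, wp = weights
    return wq * m.quality + wm * m.maintainability + wp * m.popularity

repos = [
    RepoMetrics("org/alpha", quality=0.82, maintainability=0.74, popularity=0.95),
    RepoMetrics("org/beta",  quality=0.67, maintainability=0.88, popularity=0.40),
    RepoMetrics("org/gamma", quality=0.91, maintainability=0.55, popularity=0.20),
]

# Rank repositories by aggregate score, highest first.
for rank, repo in enumerate(sorted(repos, key=aggregate_score, reverse=True), start=1):
    print(f"{rank}. {repo.name}: {aggregate_score(repo):.2f}")
```

In practice, such a framework must also normalize heterogeneous raw measures (e.g., star counts versus static-analysis scores) onto a common scale before weighting them, which is one of the design choices any repository-ranking framework has to make.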